# OS Performance Analysis on Intel® Core<sup>TM</sup>2 Duo Processors



# Software and Solutions Group

• David Levinthal, Sr SW Engineer, May 4, 2007



# Precise Event Based Sampling (PEBS) on Core2

#### Mechanism:

- Counter overflow arms PEBS
- The next event is captured and raises a PMI
- The PEBS mechanism captures architectural state information at completion of the critical instruction
  - Including EIP (+1), i.e. the address of the next instruction, even when the OS defers the PMI

Result: an accurate inst\_retired profile

```
inst_retired.any_p
x87_ops_retired.any
br_inst_retired.mispred
simd_inst_retired.any
mem_load_retired.dtlb_miss
mem_load_retired.l1d_line_miss
mem_load_retired.l1d_miss
mem_load_retired.l2_line_miss
mem_load_retired.l2_miss
```
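
The arm-then-capture sequence in the Mechanism list can be illustrated with a toy model. This is only a sketch of the three bullets, not a description of the actual hardware pipeline: the function name `simulate_pebs`, the `(ip, next_ip)` event representation, and the choice not to count the captured event are all assumptions made for the example.

```python
def simulate_pebs(events, period):
    """Toy model of PEBS sampling (hypothetical helper, not a hardware spec).

    events: iterable of (ip, next_ip) pairs, one per occurrence of the
            monitored event.
    period: number of events counted before the counter overflows.
    Returns the list of instruction pointers stored in the PEBS records.
    """
    counter = -period          # counter preloaded with the negated period
    armed = False              # set when the counter overflows
    records = []
    for ip, next_ip in events:
        if armed:
            # The next event after the overflow is captured; the record
            # holds the address of the following instruction ("EIP (+1)").
            records.append(next_ip)
            armed = False
            counter = -period  # counter reloaded from its reset value
            continue
        counter += 1
        if counter == 0:       # overflow arms PEBS
            armed = True
    return records


# Ten events at 4-byte spaced addresses, sampled with a period of 3.
events = [(0x1000 + 4 * i, 0x1000 + 4 * (i + 1)) for i in range(10)]
samples = simulate_pebs(events, period=3)
```

In this model the fourth and eighth events are captured, so `samples` holds the addresses that follow those two instructions.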





# PEBS BUFFER

#### **DS Buffer Management**

| Field (bits 63:0)        | Offset |
|--------------------------|--------|
| BTS Buffer Base          | 0H     |
| BTS Index                | 8H     |
| BTS Absolute Maximum     | 10H    |
| BTS Interrupt Threshold  | 18H    |
| PEBS Buffer Base         | 20H    |
| PEBS Index               | 28H    |
| PEBS Absolute Maximum    | 30H    |
| PEBS Interrupt Threshold | 38H    |
| PEBS Counter Reset0      | 40H    |
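
As a rough illustration, the DS buffer management layout above can be mirrored with a `ctypes` structure so that the offsets are checked mechanically. The class name `DSArea64`, the helper `pending_pebs_records`, and the 90H record size (implied by the PEBS record layout in this deck, where R15 sits at 88H) are assumptions of this sketch rather than anything defined on the slide.

```python
import ctypes


class DSArea64(ctypes.Structure):
    """64-bit DS buffer management area (hypothetical class name)."""
    _fields_ = [
        ("bts_buffer_base", ctypes.c_uint64),           # 0x00
        ("bts_index", ctypes.c_uint64),                 # 0x08
        ("bts_absolute_maximum", ctypes.c_uint64),      # 0x10
        ("bts_interrupt_threshold", ctypes.c_uint64),   # 0x18
        ("pebs_buffer_base", ctypes.c_uint64),          # 0x20
        ("pebs_index", ctypes.c_uint64),                # 0x28
        ("pebs_absolute_maximum", ctypes.c_uint64),     # 0x30
        ("pebs_interrupt_threshold", ctypes.c_uint64),  # 0x38
        ("pebs_counter_reset0", ctypes.c_uint64),       # 0x40
    ]


PEBS_RECORD_SIZE = 0x90  # assumption: 18 quadwords, RFLAGS through R15


def pending_pebs_records(ds, record_size=PEBS_RECORD_SIZE):
    """Number of records written so far: PEBS Index advances from
    PEBS Buffer Base by one record size per captured sample."""
    return (ds.pebs_index - ds.pebs_buffer_base) // record_size


# Example with made-up addresses: three records have been written.
ds = DSArea64(pebs_buffer_base=0x1000, pebs_index=0x1000 + 3 * PEBS_RECORD_SIZE)
offsets = {name: getattr(DSArea64, name).offset for name, _ in DSArea64._fields_}
```

Here `offsets` reproduces the Offset column of the table, and `pending_pebs_records(ds)` evaluates to 3 for the made-up addresses.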

#### **PEBS** Record

| Field (bits 63:0) | Offset |
|-------------------|--------|
| RFLAGS            | 0H     |
| RIP               | 8H     |
| RAX               | 10H    |
| RBX               | 18H    |
| RCX               | 20H    |
| RDX               | 28H    |
| RSI               | 30H    |
| RDI               | 38H    |
| RBP               | 40H    |
| RSP               | 48H    |
| R8                | 50H    |
| …                 | …      |
| R15               | 88H    |

Merom/Penryn - Format 0000b
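
A minimal sketch of decoding one record in this format, using Python's `struct` module. It assumes the elided rows between R8 (50H) and R15 (88H) are R9 through R14 in order, which is what the offsets imply; the names `PEBS_FIELDS`, `PEBS_RECORD`, and `parse_pebs_record` are hypothetical.

```python
import struct

# Field order follows the record layout: RFLAGS, RIP, eight legacy
# registers, then R8..R15 (R9..R14 inferred from the 50H..88H offsets).
PEBS_FIELDS = (
    "rflags", "rip", "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
) + tuple(f"r{i}" for i in range(8, 16))

# 18 little-endian 64-bit values, 90H bytes in total.
PEBS_RECORD = struct.Struct("<18Q")


def parse_pebs_record(buf, offset=0):
    """Decode one PEBS record starting at `offset` in a bytes-like buffer."""
    return dict(zip(PEBS_FIELDS, PEBS_RECORD.unpack_from(buf, offset)))


# Synthetic record whose fields simply hold their own index (0..17).
raw = PEBS_RECORD.pack(*range(18))
record = parse_pebs_record(raw)
```

The `rip` field of a decoded record is the "EIP (+1)" value, so a profiler attributing samples to instructions would need to account for that one-instruction skid.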





# VTune™ Analyzer Edit Event







# Some Features of the PMU



Setting CMASK = 1 and INV = 1 for INST\_RETIRED.ANY\_P counts cycles where no instructions were retired, even in OS "critical sections" where the PMI is deferred.
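
The sketch below shows how such a setting could be packed into an event-select register value. It assumes the architectural IA32_PERFEVTSELx bit layout (event select in bits 7:0, unit mask in 15:8, USR bit 16, OS bit 17, INT bit 20, EN bit 22, INV bit 23, CMASK in 31:24) and event code C0H with unit mask 00H for INST\_RETIRED.ANY\_P. The function name `perfevtsel` and its defaults are hypothetical, and writing the value to the MSR is out of scope.

```python
def perfevtsel(event, umask=0, *, usr=True, os=True, inv=False,
               cmask=0, interrupt=True, enable=True):
    """Pack fields into an IA32_PERFEVTSELx value (hypothetical helper)."""
    for name, field in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= field <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    value = event | (umask << 8) | (cmask << 24)
    value |= int(usr) << 16        # count at CPL 1, 2, 3
    value |= int(os) << 17         # count at CPL 0
    value |= int(interrupt) << 20  # raise PMI on overflow
    value |= int(enable) << 22     # enable the counter
    value |= int(inv) << 23        # invert the CMASK comparison
    return value


INST_RETIRED_ANY_P = 0xC0  # event select, unit mask 0x00

# Cycles in which no instruction retired: CMASK = 1, INV = 1.
no_retire_cycles = perfevtsel(INST_RETIRED_ANY_P, cmask=1, inv=True)
```

With these defaults, `hex(no_retire_cycles)` is `0x1d300c0`.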





# Some Features of the PMU



Setting CMASK = 8 and INV = 1 for INST\_RETIRED.ANY\_P counts ALL cycles, because fewer than 8 instructions retire in every cycle. This gives an accurate CPU cycle profile for the entire OS.
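
To make the threshold logic concrete, here is a small simulation of the CMASK/INV comparison over a made-up trace of instructions retired per cycle. It assumes the usual semantics (with INV = 0 a cycle counts when the event count is at least CMASK; with INV = 1 it counts when the event count is below CMASK) and a retirement width of at most 4 per cycle on Core2. The function `count_cycles` and the trace values are illustrative only.

```python
def count_cycles(retired_per_cycle, cmask, inv=False):
    """Count cycles that pass the CMASK comparison (toy model)."""
    if cmask < 1:
        raise ValueError("cmask must be >= 1 for cycle counting")
    total = 0
    for retired in retired_per_cycle:
        passes = retired < cmask if inv else retired >= cmask
        if passes:
            total += 1
    return total


# Made-up trace: instructions retired in each of eight cycles.
trace = [0, 4, 2, 0, 0, 1, 3, 0]

all_cycles = count_cycles(trace, cmask=8, inv=True)    # every cycle passes
idle_cycles = count_cycles(trace, cmask=1, inv=True)   # cycles with 0 retired
busy_cycles = count_cycles(trace, cmask=1, inv=False)  # cycles with >= 1 retired
```

Because no cycle in the trace retires 8 or more instructions, the CMASK = 8, INV = 1 setting counts all eight cycles, while CMASK = 1, INV = 1 counts only the four idle ones.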





# Some Features of the PMU



Setting CMASK = 8, INV = 1, OS = 1, USR = 0 for INST\_RETIRED.ANY\_P counts all cycles in ring 0.
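
A brief sketch of how the privilege filter combines with the threshold, again as a toy model rather than a hardware description. It assumes the OS bit enables counting at CPL 0 and the USR bit enables counting at CPL 1 to 3; `count_cycles_by_ring` and the `(cpl, retired)` trace are hypothetical.

```python
def count_cycles_by_ring(trace, cmask, inv, os, usr):
    """Count cycles passing both the ring filter and the CMASK test.

    trace: iterable of (cpl, retired) pairs, one per cycle.
    """
    total = 0
    for cpl, retired in trace:
        ring_enabled = os if cpl == 0 else usr
        passes = retired < cmask if inv else retired >= cmask
        if ring_enabled and passes:
            total += 1
    return total


# Made-up trace: three user-mode cycles and three ring 0 cycles.
trace = [(3, 2), (3, 0), (0, 1), (0, 0), (0, 4), (3, 3)]

ring0_cycles = count_cycles_by_ring(trace, cmask=8, inv=True, os=True, usr=False)
total_cycles = count_cycles_by_ring(trace, cmask=8, inv=True, os=True, usr=True)
```

For this trace `ring0_cycles` is 3 out of 6 total cycles, which is the kind of ratio a ring 0 cycle profile would report.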





